Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix ObjectStore.LocalFileSystem.put_opts for blobfuse #5094

Merged
merged 5 commits into from
Nov 27, 2023

Conversation

RobinLin666
Copy link
Contributor

@RobinLin666 RobinLin666 commented Nov 18, 2023

Which issue does this PR close?

When we try to write to mounted path with blobfuse, we'll need to update object_store to explicitly close (drop) the file before calls to std::fs::rename, otherwise the metadata is not flushed in time for the rename.
Blobfuse may be for the sake of perf, so the upload file is only available when close file. Refer: https://github.com/Azure/azure-storage-fuse/blob/master/blobfuse/fileapis.cpp#L346

delta-io/delta-rs#1765

Rationale for this change

we'll need to update object_store to explicitly close (drop) the file before calls to std::fs::rename, otherwise the metadata is not flushed in time for the rename.

What changes are included in this PR?

sync the file before renaming.

Are there any user-facing changes?

no

@github-actions github-actions bot added the object-store Object Store Interface label Nov 18, 2023
@RobinLin666 RobinLin666 marked this pull request as ready for review November 18, 2023 08:14
@tustvold
Copy link
Contributor

What is the motivation for using blobfuse over the first-class support Azure Storage, AFAICT fuse is just adding overhead (and race conditions)

@@ -1064,6 +1064,8 @@ pub enum PutMode {
/// Perform an atomic write operation if the current version of the object matches the
/// provided [`UpdateVersion`], returning [`Error::Precondition`] otherwise
Update(UpdateVersion),
/// Only for Mounted path, drop the file before renaming so that the file will upload.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a breaking API change is there some way we can avoid this?

Copy link
Contributor Author

@RobinLin666 RobinLin666 Nov 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we can directly modify the option of Override?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are adding a variant to public enumeration, this is a breaking change

@RobinLin666
Copy link
Contributor Author

What is the motivation for using blobfuse over the first-class support Azure Storage, AFAICT fuse is just adding overhead (and race conditions)

Mount is a very common usage scenario in Microsoft Fabric notebook. And it's use blobfuse to mount ADLS/OneLake storage. The benefits of using mount can be imagined. For example, users can use Azure Storage like local files. Uniform Access, Simplified Data Movement, Ease of Management, etc.

@tustvold
Copy link
Contributor

tustvold commented Nov 18, 2023

The problem is blobfuse does not behave like a filesystem (or an object store). In particular, as was commented on the linked issue, hard links are not supported and rename is a non-atomic copy followed by a delete. LocalFilesystem relies on these primitives to provide object store behaviour using a filesystem. https://docs.rs/object_store/latest/object_store/#why-not-a-filesystem-interface might be relevant here.

I think you will need to use the object_store Azure abstraction, the performance will be orders of magnitude better and you won't run into these sorts of issues. Nothing prevents you from using blobfuse in addition within your notebooks, but workloads using object_store should be configured to talk to object storage directly

Edit: marking as draft for now

@tustvold tustvold added the api-change Changes to the arrow API label Nov 18, 2023
@tustvold tustvold marked this pull request as draft November 18, 2023 21:16
@tustvold tustvold removed the api-change Changes to the arrow API label Nov 18, 2023
@RobinLin666
Copy link
Contributor Author

Hi @tustvold , thank you very much for your suggestion.
I think this line has no impact on the atomicity of the whole function, but it may affect the performance and error handling.

std::mem::drop(file);

This line releases the ownership of the file variable, but does not delete or modify the file itself. The file system is only affected when std::fs::rename or std::fs::hard_link are called. These operations are atomic, meaning they either succeed or fail without leaving any intermediate state. Therefore, the code is safe and does not cause file corruption or partial write.

However, this line may have some impact on the performance and error handling. If this line is not added, the file variable will be automatically dropped at the end of the function, which may delay some time. If this line is added, the file variable will be dropped immediately, which may improve some performance. But this also means that the file variable cannot be used for any further operations, such as checking for write errors, or closing the file.

BTW, the put method of object_store Azure abstraction also sends an http request, which is the same behavior as blobfuse, and blobfuse also sends some http requests. However, in this scenario, due to the implementation of local.rs, blobfuse will send some more requests, such as rename, etc. But this does not affect the atomicity of the whole put behavior. Yes, that’s right, in terms of performance, it may indeed be worse than using azure abstraction directly, but for some reason, users may indeed want to use mount to operate (some of the advantages of mount have been mentioned above) and do not care about this point of performance impact. I think we should support this user operation. What do you think?

If I say something wrong, please correct me, because I just started to learn arrow. Thanks!

@tustvold
Copy link
Contributor

If there is a way to provide this without breaking existing workloads I suppose we could accomodate this. Although to be completely unambiguous I consider this an anti-pattern and would strongly discourage people from trying to pretend object stores are filesystems, it is a deeply problematic approach

@@ -368,7 +371,7 @@ impl ObjectStore for LocalFileSystem {
return Err(err.into());
}

let metadata = file.metadata().map_err(|e| Error::Metadata {
let metadata = metadata(&path).map_err(|e| Error::Metadata {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now racy in the event of a concurrent put

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, fixed.

Ok(_) => None,
Err(source) => Some(Error::UnableToRenameFile { source }),
}
}
PutMode::Create => match std::fs::hard_link(&staging_path, &path) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this behave correctly on blobfuse, I'm not sure it does

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, blobfuse,s3fs and goofys all don't support hard link. But they can choose override mode.

@RobinLin666
Copy link
Contributor Author

If there is a way to provide this without breaking existing workloads I suppose we could accomodate this. Although to be completely unambiguous I consider this an anti-pattern and would strongly discourage people from trying to pretend object stores are filesystems, it is a deeply problematic approach

Thanks, I actually agree with what you said, putting a Filesystem on Top of an Object Store is a Bad Idea.
But sometimes users may not have high requirements for performance, or they may only be able to use the traditional POSIX protocol. This is also one of the meanings of mount. So I think we should not block this usage. Thanks

e_tag = Some(get_etag(&metadata));
match opts.mode {
PutMode::Overwrite => {
std::mem::drop(file);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we get a comment explaining the purpose of this to future readers, otherwise it is likely to be accidentally removed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, thanks

let _ = std::fs::remove_file(&staging_path); // Attempt to cleanup
None
Ok(_) => {
let metadata = file.metadata().map_err(|e| Error::Metadata {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work with blobfuse, the original description suggested the file had to be closed before the metadata could be retrieved?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it works with blobfuse. Because when it sends stat system call, blobfuse will look for local file cache firstly and it will success.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you confirm if this works

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I have tested it in local and it worked.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @tustvold, you can see https://github.com/Azure/azure-storage-fuse/blob/master/blobfuse/utilities.cpp#L136
image

For metadate request, blobfuse will check local cache firstly and skip remote request. It's different with rename requests.

@RobinLin666 RobinLin666 marked this pull request as ready for review November 20, 2023 02:09
@tustvold tustvold marked this pull request as draft November 26, 2023 20:27
@tustvold
Copy link
Contributor

Marking as draft as waiting on feedback, please feel free to mark ready for review when you would like me to take another look

@RobinLin666
Copy link
Contributor Author

Marking as draft as waiting on feedback, please feel free to mark ready for review when you would like me to take another look

Hi @tustvold , any question needs I answer/feedback? Please take a look again, thanks!

@tustvold
Copy link
Contributor

#5094 (comment)

@RobinLin666
Copy link
Contributor Author

#5094 (comment)

Thanks, I have answered. Please check.

@tustvold
Copy link
Contributor

I am not seeing anything 😅

@tustvold
Copy link
Contributor

Review comments, listing as "pending", aren't submitted until you complete your review

@tustvold tustvold marked this pull request as ready for review November 27, 2023 11:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
object-store Object Store Interface
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants